feat: add cloudflare-metrics worker for graphql analytics export (#28)
zackpollard merged 109 commits into main
Conversation
Preview Deployments (01b5219)
Deployment status
CI has deployed the worker + dashboard to dev (PR-28 stage): …
The worker is currently no-op on every cron tick because the analytics API token secret isn't wired up yet (a sketch of the gate is at the end of this comment).

Follow-up to start data collection
To make the collector start emitting data to VictoriaMetrics/Grafana: …
Why the token isn't generated via Terraform
The cleanest approach would be …

Integration test coverage
…
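The no-op behavior described above presumably amounts to a guard like the following. A minimal sketch, assuming the token is bound as CLOUDFLARE_API_TOKEN (the same variable name the integration tests use) and a hypothetical collect() entry point:

```ts
// Minimal sketch of the cron gate (types from @cloudflare/workers-types).
interface Env {
  CLOUDFLARE_API_TOKEN?: string; // analytics read token, bound as a secret
  CLOUDFLARE_ACCOUNT_ID: string;
}

// Hypothetical entry point; the real collector lives in src/collector.ts.
declare function collect(env: Env): Promise<void>;

export default {
  async scheduled(_controller: ScheduledController, env: Env, ctx: ExecutionContext) {
    if (!env.CLOUDFLARE_API_TOKEN) {
      // Secret not provisioned yet: log and skip instead of throwing,
      // so the cron tick shows up as ok rather than errored.
      console.log("cloudflare-metrics: no API token configured, skipping tick");
      return;
    }
    ctx.waitUntil(collect(env));
  },
};
```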
Force-pushed ef0a947 to 7c7d1d7
✅ Pipeline is now live end-to-end

Terraform now owns the API token
Following the devtools …

One wrinkle: Cloudflare provider v5 has a bug (#5045) where …

The real bug
What took most of the debugging: my first version of the GraphQL client had … Fix: look up …

Verified working
Latest cron tick at …
That's the full 20-dataset collection making it through end-to-end. The dev Grafana …
✅ Resource name enrichment (D1, queues, zones)
Both the …

If any of the three lookups fails, it's reported via a …

Token permissions
The …

Verified on …
Zones: per-tag lookup for Pages projects

Root cause: …

Fix
…
Verified on the 17:55 tick
Dashboard legends should now show real hostnames like …
✅ 1-minute granularity live, batched, and working

Final results on the …
New datasets shipped
Granularity
Dropped from …

Batching architecture
Three design changes in …
Force-pushed e0a6644 to 22b8202
…ng it via terraform
…ng bootstrap provider
…vive provider refresh
…e after v5 state wipe
29999 was the wrong workaround — the real issue was the bundled usage_model, not the value being out of range. 30000 works fine.
The curl was using -f, which fails the whole terraform apply if the PATCH returns any HTTP error. The service-env settings are sticky once set, so we don't actually need to re-PATCH on every deploy — make it best-effort and just log the response. Also forces a fresh isolate via the new version, unblocking the worker, which has been hitting CPU exceedances on a long-lived isolate.
Captures all console.log and runtime traces in Cloudflare's Observability dashboard so we can see what was happening during CPU-exceeded incidents (logs are otherwise lost when the invocation is killed). 100% sampling so we don't miss anything during debugging; can lower the head_sampling_rate later if log volume becomes a concern.
Observability data shows Cloudflare drops/delays about 25% of our cron triggers. When it recovers it fires the missed triggers as a burst (observed up to 10 at once) all at the same wall clock. With Date.now() as the query anchor, every catch-up invocation ran the exact same query and the originally-scheduled minutes were lost.

Fix by anchoring the query window to controller.scheduledTime so each catch-up invocation queries the window it was originally scheduled for. Also bumps DEFAULT_WINDOW_MS from 3 → 5 min so each minute is covered by ~6 consecutive ticks instead of 4 — with a 25% miss rate the probability of all 6 ticks missing drops from 0.4% to 0.024%, which should essentially eliminate the gaps. VictoriaMetrics dedupes on (series, timestamp) so the extra overlap is free on the storage side.
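A minimal sketch of the anchoring change, assuming a collect() entry point that takes an explicit time window (the constant name DEFAULT_WINDOW_MS is from the commit; the collector signature is an assumption):

```ts
// Anchor the query window to the tick's *scheduled* minute rather than
// wall-clock now, so burst-delivered catch-up invocations each re-query
// the window they were originally scheduled for.
const DEFAULT_WINDOW_MS = 5 * 60 * 1000; // bumped from 3 min per the commit

interface Env {
  CLOUDFLARE_API_TOKEN: string;
  CLOUDFLARE_ACCOUNT_ID: string;
}

// Hypothetical collector entry point taking an explicit time window.
declare function collect(env: Env, window: { start: Date; end: Date }): Promise<void>;

export default {
  async scheduled(controller: ScheduledController, env: Env, ctx: ExecutionContext) {
    // controller.scheduledTime is the epoch-ms the tick was supposed to
    // fire, even when Cloudflare delivers it late in a catch-up burst.
    const end = new Date(controller.scheduledTime);
    const start = new Date(controller.scheduledTime - DEFAULT_WINDOW_MS);
    ctx.waitUntil(collect(env, { start, end }));
  },
};
```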
Two bugs: (1) sum by (status) — the metric's label is http_response_status, not status. Everything was collapsing into a single empty-key series. (2) rate() on what's effectively a per-minute sum gauge, not a monotonic counter — the raw sample values are independent per minute, so rate's counter-reset handling isn't meaningful. Changed to a direct per-minute sum grouped by http_response_status. If still gappy after this, investigate whether the underlying samples are actually missing in VM.
Our collector emits per-minute gauge values (sum of requests/errors/ subrequests within each minute), not monotonic counters. rate() on these produces display artifacts — apparent gaps when the per-second derivative can't be computed meaningfully between non-monotonic samples. Switching to raw metric display shows the actual per-minute counts directly. Fixes reported gaps in subrequests-per-script for version-api-prod between 07:17-07:21 where Cloudflare source data confirmed all minutes had ~3000 subrequests and all cron ticks ran successfully.
All self-telemetry metrics are per-tick gauges emitted every minute. rate() is wrong on these (not counters), and 5m/10m increase() windows are unnecessarily wide for 1-minute data. Changed:

- rate(metric[5m]) → raw metric (rows, points, lookup counts, HTTP)
- increase(metric[5m]) → increase(metric[1m]) (errors, flush errors)
- increase(metric[10m]) → increase(metric[1m]) (cron exceptions)
Per-minute gauge data displayed as disconnected points looks like random spikes. Set spanNulls=true and lineInterpolation=smooth on all 9 timeseries panels so data points connect into a continuous trend line.
Comprehensive fix across all 19 dashboards:

- 84 rate(cf_...[Xm]) → raw metric: our collector emits per-minute gauge values, not monotonic counters. rate() on these produced erratic spikes whenever traffic dropped between minutes (interpreted as counter resets). Raw metric display shows actual per-minute counts.
- 2 increase() windows narrowed from 5m/10m → 1m to match our 1-minute cron interval.
- 132 timeseries panels styled: spanNulls=true, lineInterpolation=smooth, showPoints=never. Per-minute data points now connect into smooth trend lines instead of appearing as disconnected spikes.
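At that scale the change is mechanical; whether or not it was scripted, the transform amounts to something like the sketch below, assuming Grafana's dashboard JSON layout (the Panel/Target shapes are abbreviated and this script is not part of the PR):

```ts
// Sketch of the bulk rewrite: unwrap rate(cf_...[Xm]) to the raw per-minute
// gauge, and style every timeseries panel so points connect into a line.
interface Target { expr?: string }
interface Panel {
  type: string;
  targets?: Target[];
  fieldConfig?: { defaults: { custom?: Record<string, unknown> } };
}

function rewritePanel(panel: Panel): void {
  for (const t of panel.targets ?? []) {
    if (t.expr) {
      // rate(cf_foo{...}[5m]) -> cf_foo{...}  (per-minute gauges, not counters)
      t.expr = t.expr.replace(/rate\((cf_[a-z0-9_]+(?:\{[^}]*\})?)\[\d+m\]\)/g, "$1");
    }
  }
  if (panel.type === "timeseries" && panel.fieldConfig) {
    panel.fieldConfig.defaults.custom = {
      ...panel.fieldConfig.defaults.custom,
      spanNulls: true,
      lineInterpolation: "smooth",
      showPoints: "never",
    };
  }
}
```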
…lo writes

Cloudflare routes cron triggers to multiple colos simultaneously. Each colo queries the GraphQL analytics API and writes to VictoriaMetrics. Due to eventual consistency, different colos can return different aggregation counts for the same minute — one colo might see 3000 subrequests while another only sees 450 (partial data). VM's last-write-wins causes the stored value to oscillate between the competing writes, producing 6-7x swings on the dashboard.

Fix by wrapping all 128 timeseries metric queries with max_over_time(metric[2m]). This takes the highest value for each sub-series over a 2-minute window, ensuring the most complete colo's data wins regardless of write order. For sum/count metrics, max picks the most complete data. For max/p99 metrics, max is also correct. Instant queries (stat panels using increase([24h])) are excluded since they aggregate over long windows where the oscillation averages out.
Our metrics are per-minute gauges (value = count within that minute), not monotonic counters. increase() computes (last - first), which is meaningless for gauges — it could return near-zero for "total requests in 24h" even when there were millions. Changed:

- 65 stat panels: increase(metric[24h]) → sum_over_time(max_over_time(metric[1m])[24h:1m]). Correctly totals all per-minute values over the window.
- 20 billing queries: increase(metric[1h]) → sum_over_time(metric[1h:1m]). Correctly computes hourly cost from per-minute data.
- 6 self-telemetry queries: increase(metric[1m]) → raw metric or sum_over_time.

Zero rate() or increase() remaining across all 19 dashboards. Every cf_ metric query now uses the appropriate *_over_time wrapper:

- max_over_time: timeseries (multi-colo dedup)
- sum_over_time: totals (stat panels, billing, error tables)
- last_over_time: storage snapshots (carry forward)
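Spelled out with one example per wrapper (the measurement names are from the dataset list in the PR description; the field suffixes, label matchers, and lookback windows are illustrative):

```ts
// Illustrative PromQL kept as strings; only the wrapper pattern is the point.

// Timeseries panels: dedupe competing multi-colo writes by taking the most
// complete value for each series over the last 2 minutes.
const timeseriesExpr =
  'max_over_time(cf_workers_invocations_requests{script_name="version-api-prod"}[2m])';

// Stat panel totals: dedupe per minute first, then sum every per-minute
// value across the window with a 1m-step subquery.
const dayTotalExpr =
  'sum_over_time(max_over_time(cf_workers_invocations_requests[1m])[24h:1m])';

// Storage snapshots: carry the last reported gauge value forward.
const storageExpr = 'last_over_time(cf_r2_storage_bytes[2m])';
```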
Panels were labeled with per-second units (reqps, ops, Bps) from when queries used rate(). Now that we display raw per-minute gauge values:

- 42 panels: reqps → short (per-minute count, not per-second)
- 9 panels: ops → short (same)
- 4 panels: Bps → bytes (per-minute byte total, not bytes/sec)
- 13 panel titles: removed "Rate" since we show counts not rates
- Guard division by zero in scheduled worker CPU avg (collector.ts)
- Skip NaN/Infinity values in InfluxDB line protocol serialization
- Remove dead applyResourceTags() function from emit.ts
- Use max() instead of sum() in alert PromQL for multi-colo safety
- Improve curl PATCH logging in worker.tf to surface failures
- Add test suites: emit, metric-providers (escaping + NaN), flush-state, resource-cache, scheduled handler window logic
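Sketches of the first two guards; the function and parameter names here are assumptions, not the actual collector.ts/emit.ts code:

```ts
// Guard the scheduled-worker CPU average against zero invocations.
function cpuAvgMs(totalCpuTimeUs: number, invocations: number): number | undefined {
  return invocations > 0 ? totalCpuTimeUs / invocations / 1000 : undefined;
}

// Drop NaN/Infinity fields during line-protocol serialization so one bad
// value cannot poison a whole batched write.
function serializeFields(fields: Record<string, number>): string {
  return Object.entries(fields)
    .filter(([, value]) => Number.isFinite(value))
    .map(([key, value]) => `${key}=${value}`)
    .join(",");
}
```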
…-deploy PATCH

the cloudflare_worker_version resource does support usage_model (deprecated but functional); setting it on the version itself is the durable fix. the prior post-deploy service-env PATCH was unreliable — after commit 3eda62b the worker ran standard for ~45 min then reverted to bundled on its own, causing 2+ hours of exceededCpu crons.
empirically verify what usage_model new cloudflare workers default to at the version level. scheduled handler burns >50ms of cpu so if the default is `bundled` (50ms cap) the cron will die with `exceededCpu`, and if `standard` it completes with `outcome=ok`. no usage_model field is set on cloudflare_worker_version in this commit — a follow-up will add `usage_model = "standard"` to confirm the fix.
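A sketch of the kind of scheduled handler the cpu-test worker needs. The busy-loop approach is an assumption (the commit only states the handler burns >50ms of CPU); note that in Workers Date.now() does not advance during pure compute, so the loop is bounded by iterations rather than wall time:

```ts
// cpu-test sketch: burn CPU well past the bundled 50ms cap. Under
// usage_model=bundled the cron should die with exceededCpu; under
// standard it should complete with outcome=ok.
export default {
  async scheduled(_controller: ScheduledController, _env: unknown, _ctx: ExecutionContext) {
    let sink = 0;
    // Iteration count chosen to burn on the order of seconds of CPU,
    // comfortably past 50ms on any isolate.
    for (let i = 0; i < 50_000_000; i++) {
      sink += Math.sqrt(i);
    }
    console.log(`cpu-test finished, sink=${sink}`); // visible via wrangler tail
  },
};
```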
commit a proved new workers default to bundled (50ms cap) — first cron on cpu-test-api-dev-pr-28 fired exceededCpu at cpu=50ms wall=51ms. add usage_model = "standard" to cloudflare_worker_version and re-verify via wrangler tail that cron outcome flips to ok with cpu > 50ms.
experiment confirmed empirically:

- new cloudflare workers deployed via terraform default to usage_model=bundled (50ms cpu cap). cpu-test first cron: exceededCpu at cpu=50ms wall=51ms.
- setting usage_model="standard" on cloudflare_worker_version lifts the cap. same handler, new version: cpu=2050ms wall=2105ms — 41x the bundled cap.
- the worker has been deleted from cloudflare via the api; postgres tf state schema services_cf_workers_cpu-test_dev_pr-28 is orphaned but harmless (module dir removed so terragrunt run-all won't discover it).
re-create cpu-test worker with usage_model=standard + limits.cpu_ms=30000 to see if it also reverts to bundled after some time. bump cloudflare-metrics source to force a new version (the current one reverted to bundled after ~2 hours of stable operation). will poll the usage_model field to catch exactly when it flips.
previous deploy (ea725556) at 21:58 UTC reverted to bundled at 23:06 UTC (~1h8m post-deploy). bump FORCE_NEW_VERSION to trigger a new version and confirm both (a) redeploy fixes the 50ms cap immediately, and (b) the ~1 hour revert pattern repeats on the new version.
Force-pushed 5a61fc0 to 4fd3d95
- prettier-format resource-cache.test.ts
- pass explicit undefined to normalizeTagValue to satisfy TS2554
- add --passWithNoTests to cpu-test vitest (experimental worker has no tests)
Force-pushed 4fd3d95 to b53dfc5
- delete apps/cpu-test and its terraform module — experiment complete
- drop FORCE_NEW_VERSION redeploy trigger from cloudflare-metrics
Adds a new Cloudflare Worker that runs on a 5-minute cron, queries the
Cloudflare GraphQL Analytics API for every resource type we currently
use (and several we don't yet), and pushes the data into VictoriaMetrics
via the existing InfluxDB line-protocol endpoint.
What's collected
20 datasets, each mapped to a cf_* measurement with snake_case tags and fields:

- cf_workers_invocations, cf_workers_subrequests, cf_workers_overview
- cf_d1_queries, cf_d1_storage
- cf_r2_operations, cf_r2_storage
- cf_kv_operations, cf_kv_storage
- cf_durable_objects_invocations, cf_durable_objects_periodic, cf_durable_objects_storage, cf_durable_objects_sql_storage, cf_durable_objects_subrequests
- cf_queue_operations, cf_queue_backlog
- cf_hyperdrive_queries, cf_hyperdrive_pool
- cf_http_requests_overview
- cf_pages_functions_invocations

All points are tagged with account_id and written with the Cloudflare bucket timestamp so historical backfills land in the right place.
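For reference, one row serialized to line protocol might look like the sketch below; tag-value escaping and the real Metric type are omitted, and only the account_id tag and bucket-timestamp behavior are from the text above:

```ts
// Serialize one analytics row: measurement, snake_case tags, numeric fields,
// and the Cloudflare bucket timestamp in nanoseconds so backfilled points
// land on their original minute. Tag/field escaping is omitted here.
function toLineProtocol(
  measurement: string,
  tags: Record<string, string>,
  fields: Record<string, number>,
  bucketTime: Date,
): string {
  const tagStr = Object.entries(tags).map(([k, v]) => `${k}=${v}`).join(",");
  const fieldStr = Object.entries(fields).map(([k, v]) => `${k}=${v}`).join(",");
  const ns = BigInt(bucketTime.getTime()) * 1_000_000n;
  return `${measurement},${tagStr} ${fieldStr} ${ns}`;
}

// toLineProtocol("cf_workers_invocations",
//   { account_id: "abc123", script_name: "version-api-prod" },
//   { requests: 2950 }, new Date("2025-01-01T07:17:00Z"))
// -> "cf_workers_invocations,account_id=abc123,script_name=version-api-prod requests=2950 <ns>"
```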
Structure
Mirrors the existing version worker:

- src/metrics.ts — extends the shared pattern with floatField and a custom export timestamp so analytics values don't get truncated to integers.
- src/graphql-client.ts — typed wrapper over the Cloudflare GraphQL API using a single JSON filter variable (works around the per-dataset filter input types).
- src/datasets.ts — single registry describing every dataset's dimensions, aggregation blocks, and tag/field projection. Adding a new dataset is one entry (sketched below).
- src/collector.ts — fetches each dataset, converts rows to Metric points, and records a self-observation per dataset (cloudflare_metrics_collector_dataset).
- src/index.ts — fetch handler for /health and /collect (manual trigger) plus the
scheduled() cron entry point.
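To make the one-entry-per-dataset design concrete, a hypothetical registry entry and the shared variable builder; the real DatasetSpec shape, GraphQL node names, and filter keys may differ, and only the single-JSON-filter design is from the description above:

```ts
// Hypothetical shape of a datasets.ts entry: dimensions, aggregation, and
// the tag/field projection live in one declarative object per dataset.
interface DatasetSpec {
  measurement: string;            // cf_* measurement name
  node: string;                   // GraphQL node under viewer.accounts
  dimensions: string[];           // grouping dimensions to request
  tags: Record<string, string>;   // row key -> snake_case tag name
  fields: Record<string, string>; // row key -> field name
}

const workersInvocations: DatasetSpec = {
  measurement: "cf_workers_invocations",
  node: "workersInvocationsAdaptive",
  dimensions: ["scriptName", "status"],
  tags: { scriptName: "script_name", status: "status" },
  fields: { requests: "requests", errors: "errors", subrequests: "subrequests" },
};

// Every dataset query shares one untyped JSON filter variable, sidestepping
// the per-dataset *FilterInput types the description mentions.
function buildVariables(accountId: string, start: Date, end: Date) {
  return {
    accountTag: accountId,
    filter: {
      datetime_geq: start.toISOString(), // filter keys are illustrative
      datetime_lt: end.toISOString(),
    },
  };
}
```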
Testing

Unit tests (pnpm run test): line protocol formatting, query builder, variable builder, GraphQL client error paths, collector
dimension/field projection, dataset registry invariants, HTTP handler.
Integration tests (pnpm run test:integration, gated on CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID): every dataset query is executed against the real Cloudflare API and validated to match the
parser's expected shape, plus a full collector run. All 20 datasets
succeed; the first run against our production account emitted
11,330 metric points.
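The env gate presumably follows the usual vitest pattern; a sketch with the suite contents elided (only the two gating variables are from the text above):

```ts
import { describe, expect, it } from "vitest";

// Integration tests only run when real credentials are present; CI and
// local runs without the secrets skip the whole suite.
const token = process.env.CLOUDFLARE_API_TOKEN;
const accountId = process.env.CLOUDFLARE_ACCOUNT_ID;

describe.skipIf(!token || !accountId)("datasets against the live GraphQL API", () => {
  it("matches the parser's expected row shape", async () => {
    // ...execute each registered dataset query here with the real token...
    expect(token).toBeDefined();
  });
});
```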
Deployment
New Terraform module at deployment/modules/cloudflare/workers/cloudflare-metrics/:

- api-token.tf — provisions a scoped read-only Cloudflare API token with "Account Analytics Read" permission via cloudflare_api_token.
- worker.tf — worker, version, deployment, and a */5 * * * * cron trigger.

No custom domain — the worker is only triggered by cron.